library(tidyverse)   # data wrangling and ggplot2
library(plotly)      # interactive plots
library(reshape)     # cast() for wide reshaping
library(kableExtra)  # styled tables
library(rmdformats)  # HTML output theme

data <- read_csv("C:/Users/edaub/Downloads/DataCollection.csv") %>%
  mutate(prop = `POINTS EARNED`/`POINTS POSSIBLE`)  # score as a proportion

This is a preliminary analysis of the distributions of test scores earned by students enrolled in Computer Science courses at Abington Senior High School during the 2020-21 school year. This analysis has been prepared for aggregate, descriptive purposes only; it should not be used to assess individual student performance or to build predictive models.

Assessments <- unique(data$ASSESSMENT)
Content <- c("Primitive Types", "Primitive Types", "Iteration", "Array", "Array List", "2D Array")
content_specific <- data.frame(Assessments, Content)

kable(content_specific) %>%
  kable_paper(full_width = F, position = "left") %>%
  column_spec(1, bold = T, border_right = T) %>%
  column_spec(2, background = "lightblue")
Assessments                         Content
Unit 1 Progress Check: MCQ Part A   Primitive Types
Unit 1 Progress Check: MCQ Part B   Primitive Types
Unit 4 Progress Check: MCQ          Iteration
Unit 6 Progress Check: MCQ          Array
Unit 7 Progress Check: MCQ          Array List
Unit 8 Progress Check: MCQ          2D Array

Score Distributions

data_ass <- data %>%
  group_by(`ASSESSMENT`) %>%
  mutate(avg = mean(prop)) %>%
  ungroup()

ggplot(data_ass, aes(x = prop)) + 
  geom_histogram(bins = 10) + 
  facet_wrap(~`ASSESSMENT`) +
  geom_vline(data = data_ass, aes(xintercept = avg), linetype = "dashed", color = "red4") + 
  theme_minimal() + 
  xlab("Student Scores") + 
  ylab("Number of Students")
Figure A. Distribution of student scores by assessment type

Key Findings

  • A few assessments show more even score distributions than others, most notably Unit 1 Progress Check Parts A and B.
  • The most unevenly distributed assessments are the Unit 8 and Unit 6 Progress Checks. At first glance, this may indicate a widening divide in learning achievement over time (i.e., students are more “even” at the beginning of the course and more divided as it continues).
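One way to make the “evenness” claim concrete is to compare the standard deviation of score proportions within each assessment; a larger spread suggests a more divided class. A minimal sketch, using a small synthetic frame as a stand-in for the real data:

```r
library(dplyr)

# synthetic stand-in: one relatively even assessment, one divided one
scores <- tibble::tibble(
  assessment = rep(c("Even Unit", "Divided Unit"), each = 4),
  prop       = c(0.45, 0.55, 0.60, 0.70,
                 0.20, 0.25, 0.90, 0.95)
)

# spread of score proportions within each assessment
spread_by_assessment <- scores %>%
  group_by(assessment) %>%
  summarize(sd_prop = sd(prop), .groups = "drop")

spread_by_assessment
```

On the real data, substituting `data` and grouping by `ASSESSMENT` would give one spread figure per progress check.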

Pass/Fail

# proportion of students scoring at or below 50% on each assessment
data_below <- data %>%
  mutate(below = ifelse(prop <= 0.5, "below", "above")) %>%
  group_by(`ASSESSMENT`, below) %>%
  summarize(n = n(), .groups = "drop_last") %>%
  mutate(prop = n/sum(n))
  
# pie charts, one per assessment
par(mfrow = c(2, 3))
for (a in unique(data_below$ASSESSMENT)) {
  pie(data_below$prop[data_below$ASSESSMENT == a],
      col = rainbow(2), labels = "", xlab = a)
}
legend("top", legend = c("Above 50%", "Below 50%"), fill = rainbow(2), title = "Legend")
Figure B. Proportion of student pass/fails by assessment type

Key Findings

  • Students begin the course with a larger gap in pass/fail outcomes and end the course more evenly.
  • This is consistent with the score distributions above: although the score distributions become less even over time, part of that change reflects the pass/fail ratio shrinking in magnitude.
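The same pass/fail split can also be read as a plain table of fail rates, which is often easier to compare across assessments than pie slices. A sketch with synthetic stand-in scores:

```r
library(dplyr)

# synthetic stand-in scores for two assessments
scores <- tibble::tibble(
  assessment = rep(c("Unit 1", "Unit 8"), each = 5),
  prop       = c(0.30, 0.40, 0.60, 0.70, 0.80,
                 0.45, 0.55, 0.65, 0.75, 0.85)
)

# share of students at or below 50% on each assessment
fail_rates <- scores %>%
  group_by(assessment) %>%
  summarize(fail_rate = mean(prop <= 0.5), .groups = "drop")

fail_rates
```

On the real data, this is one `summarize()` on the `data` frame, with `ASSESSMENT` as the grouping column.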
# wide table: one row per student, one column per assessment
data_reshape <- data %>%
  select(last_name = `LAST NAME`,
         first_name = `FIRST NAME`,
         assessment = `ASSESSMENT`,
         points = `POINTS EARNED`,
         total_points = `POINTS POSSIBLE`) %>%
  mutate(prop = points/total_points) %>%
  cast(first_name + last_name ~ assessment, mean, value = "prop")
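The reshape package is superseded; the same student-by-assessment wide table can be built with `tidyr::pivot_wider()`, which the tidyverse loaded above already provides. A sketch with synthetic stand-in columns:

```r
library(dplyr)
library(tidyr)

# synthetic long-format scores
long_scores <- tibble::tibble(
  last_name  = c("Doe", "Doe", "Roe", "Roe"),
  first_name = c("Jane", "Jane", "Rick", "Rick"),
  assessment = c("Unit 1", "Unit 4", "Unit 1", "Unit 4"),
  prop       = c(0.50, 0.60, 0.70, 0.80)
)

# one row per student, one column per assessment
wide_scores <- long_scores %>%
  pivot_wider(names_from = assessment, values_from = prop, values_fn = mean)

wide_scores
```

`values_fn = mean` plays the role of the aggregation function passed to `cast()` above, should a student have more than one score per assessment.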

# interactive line plot of each student's score trajectory
# (note: grouping by last name alone will merge students who share one)
lines_plot <- ggplot(data_ass, aes(x = `ASSESSMENT`, y = prop, group = `LAST NAME`, colour = `LAST NAME`)) + 
  geom_line() + 
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) + 
  xlab("Assessment Type") + 
  ylab("Score")

ggplotly(lines_plot)

Figure C. Interactive plot of student scores

Concluding Remarks

Based on this analyst’s understanding of educational trends and classroom assessment, these data reflect a typical learning cycle. Students begin the course on uneven footing, and scores tend to increase (with some exceptions) over time. Pass/fail ratios similarly decrease (with some exceptions) over time, indicating steady gains in student understanding.

This analyst posits that further attention should be paid to Units 6, 7, and 8, given their pass/fail ratios and uneven score distributions relative to the other assessments. Because these were the last assessments administered, it follows that they would show the largest variance in student understanding and performance. Unit 1 shows a high rate of “fails” relative to other assessments, but its second assessment (Part B) shows a lower rate, which would indicate an acceptable degree of improvement in understanding for that unit.

Further analysis should explore the following areas:

  • Distribution of student scores between top and bottom 10%
  • Scores on free response questions relative to multiple choice (all of these assessments are MCQ-based)
  • A more detailed look at individual student progress across time
  • A more detailed look at individual student progress across content areas
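As a starting point for the first item, the gap between the top and bottom deciles could be sketched with `quantile()`; the synthetic scores below are purely illustrative stand-ins for the real proportions:

```r
# synthetic stand-in score proportions for one assessment
scores <- c(0.15, 0.30, 0.45, 0.50, 0.55, 0.60, 0.65, 0.75, 0.85, 0.95)

# 10th and 90th percentile cutoffs
cuts <- quantile(scores, probs = c(0.1, 0.9))

# mean score within each tail
bottom_mean <- mean(scores[scores <= cuts[1]])
top_mean    <- mean(scores[scores >= cuts[2]])

top_mean - bottom_mean  # size of the decile gap
```

On the real data, computing this gap per assessment (grouped by `ASSESSMENT`) would show whether the top/bottom divide widens over the course.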